{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# 07.04 - TENSORFLOW"]}, {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["endpoint https://m5knaekxo6.execute-api.us-west-2.amazonaws.com/dev-v0001/rlxmooc\n"]}, {"data": {"text/html": ["


"], "text/plain": [""]}, "execution_count": 1, "metadata": {}, "output_type": "execute_result"}], "source": ["!wget --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/ai4eng.v1/main/content/init.py\n", "import init; init.init(force_download=False); init.get_weblink()"]}, {"cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": ["import numpy as np\n", "import matplotlib.pyplot as plt\n", "from local.lib import mlutils\n", "from IPython.display import Image\n", "\n", "try:\n", " %tensorflow_version 2.x\n", " print (\"Using TF2 in Google Colab\")\n", "except:\n", " pass\n", "\n", "import tensorflow as tf\n", "%matplotlib inline"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## A dataset (again)"]}, {"cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [{"data": {"text/plain": [""]}, "execution_count": 4, "metadata": {}, "output_type": "execute_result"}, {"data": {"image/png": "\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["## KEEPOUTPUT\n", "from sklearn.datasets import make_moons\n", "X,y = make_moons(300, noise=.15)\n", "plt.scatter(X[:,0][y==0], X[:,1][y==0], color=\"blue\", label=\"class 0\", alpha=.5)\n", "plt.scatter(X[:,0][y==1], X[:,1][y==1], color=\"red\", label=\"class 1\", alpha=.5)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## A neural network in tensorflow, 4 neurons in the hidden layer, 1 output"]}, {"cell_type": "code", "execution_count": 170, "metadata": {}, "outputs": [], "source": ["model = tf.keras.Sequential([\n", " tf.keras.layers.Dense(4, activation='tanh'),\n", " tf.keras.layers.Dense(1, activation='sigmoid')\n", "])\n", "model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=.5),\n", " loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),metrics=['accuracy'])"]}, {"cell_type": "code", "execution_count": 171, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Epoch 1/10\n", "300/300 [==============================] - 0s 480us/step - loss: 0.4007 - accuracy: 0.8267\n", "Epoch 2/10\n", "300/300 [==============================] - 0s 478us/step - loss: 0.3605 - accuracy: 0.8467\n", "Epoch 3/10\n", "300/300 [==============================] - 0s 469us/step - loss: 0.2444 - accuracy: 0.8900\n", "Epoch 4/10\n", "300/300 [==============================] - 0s 473us/step - loss: 0.1709 - accuracy: 0.9467\n", "Epoch 5/10\n", "300/300 [==============================] - 0s 470us/step - loss: 0.1837 - accuracy: 0.9467\n", "Epoch 6/10\n", "300/300 [==============================] - 0s 455us/step - loss: 0.1485 - accuracy: 0.9600\n", "Epoch 7/10\n", "300/300 [==============================] - 0s 467us/step - loss: 0.1072 - accuracy: 0.9633\n", "Epoch 8/10\n", "300/300 [==============================] - 0s 456us/step - loss: 0.0981 - accuracy: 0.9667\n", "Epoch 9/10\n", "300/300 [==============================] - 0s 472us/step - 
loss: 0.1252 - accuracy: 0.9600\n", "Epoch 10/10\n", "300/300 [==============================] - 0s 465us/step - loss: 0.1253 - accuracy: 0.9633\n"]}, {"data": {"text/plain": [""]}, "execution_count": 171, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "model.fit(X,y, epochs=10, batch_size=1)"]}, {"cell_type": "code", "execution_count": 173, "metadata": {}, "outputs": [{"data": {"text/plain": ["(0.5318, 0.4682)"]}, "execution_count": 173, "metadata": {}, "output_type": "execute_result"}, {"data": {"image/png": "\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["## KEEPOUTPUT\n", "predict = lambda X: (model.predict(X)[:,0]>.5).astype(int)\n", "mlutils.plot_2Ddata_with_boundary(predict, X, y)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## A bigger network\n", "\n", "- different activation functions\n", "- different optimizer"]}, {"cell_type": "code", "execution_count": 215, "metadata": {}, "outputs": [], "source": ["model = tf.keras.Sequential([\n", " tf.keras.layers.Dense(20, activation='tanh'),\n", " tf.keras.layers.Dense(50, activation='relu'),\n", " tf.keras.layers.Dense(1, activation='sigmoid')\n", "])\n", "model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=.01),\n", " loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),metrics=['accuracy'])"]}, {"cell_type": "code", "execution_count": 216, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Epoch 1/10\n", "300/300 [==============================] - 0s 510us/step - loss: 0.3400 - accuracy: 0.8567\n", "Epoch 2/10\n", "300/300 [==============================] - 0s 508us/step - loss: 0.2804 - accuracy: 0.8867\n", "Epoch 3/10\n", "300/300 [==============================] - 0s 487us/step - loss: 0.2450 - accuracy: 0.8967\n", "Epoch 4/10\n", "300/300 [==============================] - 0s 532us/step - loss: 0.1722 - accuracy: 0.9433\n", "Epoch 5/10\n", "300/300 [==============================] - 0s 516us/step - loss: 0.1660 - accuracy: 0.9500\n", "Epoch 6/10\n", "300/300 [==============================] - 0s 527us/step - loss: 0.0917 - accuracy: 0.9700\n", "Epoch 7/10\n", "300/300 [==============================] - 0s 516us/step - loss: 0.1127 - accuracy: 0.9667\n", "Epoch 8/10\n", "300/300 [==============================] - 0s 507us/step - loss: 0.1118 - accuracy: 0.9600\n", "Epoch 9/10\n", "300/300 [==============================] - 0s 526us/step - loss: 0.1076 - accuracy: 0.9500\n", "Epoch 10/10\n", "300/300 
[==============================] - 0s 487us/step - loss: 0.0840 - accuracy: 0.9800\n"]}, {"data": {"text/plain": [""]}, "execution_count": 216, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "model.fit(X,y, epochs=10, batch_size=1)"]}, {"cell_type": "code", "execution_count": 217, "metadata": {}, "outputs": [{"data": {"text/plain": ["(0.522575, 0.477425)"]}, "execution_count": 217, "metadata": {}, "output_type": "execute_result"}, {"data": {"image/png": "\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["## KEEPOUTPUT\n", "predict = lambda X: (model.predict(X)[:,0]>.5).astype(int)\n", "mlutils.plot_2Ddata_with_boundary(predict, X, y)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Cross entropy - multiclass classification\n", "\n", "Follow [THIS EXAMPLE](https://www.tensorflow.org/tutorials/keras/classification) on the TensorFlow documentation site. Observe that:\n", "\n", "- the labels correspond to a 10-class classification problem\n", "- the network contains 10 output neurons, one per output class\n", "- the loss function is `SparseCategoricalCrossentropy`\n", "\n", "Observe how **cross entropy** works with 4 classes:\n", "\n", "- first we convert the expected labels to a one-hot encoding\n", "- we create a network with four output neurons, one per class, and no activation on the last layer\n", "- we interpret each neuron's output as an element of a probability distribution\n", "- we normalize those outputs so that they add up to one\n", "- we consider the network output better the more probability it assigns to the correct class"]}, {"cell_type": "markdown", "metadata": {}, "source": ["**expected classes for five data points**"]}, {"cell_type": "code", "execution_count": 239, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([3, 1, 2, 0, 3])"]}, "execution_count": 239, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "y = np.random.randint(4, size=5)\n", "y"]}, {"cell_type": "markdown", "metadata": {}, "source": ["**convert it to one hot encoding**"]}, {"cell_type": "code", "execution_count": 240, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[0, 0, 0, 1],\n", " [0, 1, 0, 0],\n", " [0, 0, 1, 0],\n", " [1, 0, 0, 0],\n", " [0, 0, 0, 1]])"]}, "execution_count": 240, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "y_ohe = np.eye(4)[y].astype(int)\n", "y_ohe"]}, {"cell_type": "markdown", "metadata": {}, "source": 
["**simulate some neural network output with NO ACTIVATION function**\n", "\n", "with 4 output neurons (one per class), so for each input element (we have five) we have 4 outputs.\n", "\n", "These raw outputs are called **LOGITS** in TensorFlow"]}, {"cell_type": "code", "execution_count": 241, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 0.06, -0.31, -0.95, 0.39],\n", " [ 0.92, -0.48, -0.08, 0.53],\n", " [-0.5 , 0.22, -0.18, 1.81],\n", " [-0.49, -1.41, 0.09, -0.11],\n", " [-0.73, 0.26, -1.63, -0.68]])"]}, "execution_count": 241, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "y_hat = np.round(np.random.normal(size=y_ohe.shape), 2)\n", "y_hat"]}, {"cell_type": "markdown", "metadata": {}, "source": ["**normalize LOGITS**. This is the **SOFTMAX function**\n", "\n", "**LOGITS** obtained from the network's last layer with no activation (the formulas below are written for 10 classes, indices 0 to 9, as in the linked example; in our toy example the indices run from 0 to 3)\n", "\n", "$$\\hat{\\mathbf{y}}^{(i)} = [\\hat{y}^{(i)}_0, \\hat{y}^{(i)}_1,...,\\hat{y}^{(i)}_9]$$\n", "\n", "**SOFTMAX ACTIVATION**\n", "\n", "$$\\hat{\\bar{\\mathbf{y}}}^{(i)} = [\\hat{\\bar{y}}^{(i)}_0, \\hat{\\bar{y}}^{(i)}_1,...,\\hat{\\bar{y}}^{(i)}_9]$$\n", "\n", "with\n", "\n", "$$\\hat{\\bar{y}}^{(i)}_k = \\frac{e^{\\hat{y}^{(i)}_k}}{\\sum_{j=0}^9e^{\\hat{y}^{(i)}_j}}$$\n", "\n", "\n", "this ensures:\n", "\n", "- $\\sum_{k=0}^9 \\hat{\\bar{y}}^{(i)}_k=1$\n", "- $0 \\le \\hat{\\bar{y}}^{(i)}_k \\le 1$\n", "\n", "This way, for each input the outputs form a valid probability distribution.\n", "\n", "This is implemented in **TensorFlow**"]}, {"cell_type": "code", "execution_count": 242, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[0.29019814, 0.20044982, 0.10569567, 0.40365637],\n", " [0.43638904, 0.10761221, 0.16053855, 0.2954602 ],\n", " [0.06893706, 0.14162659, 0.09493514, 0.69450122],\n", " [0.21519991, 0.08576126, 0.38435531, 0.31468351],\n", " [0.19420963, 0.52266365, 0.07895974, 0.20416697]])"]}, "execution_count": 242, "metadata": {}, "output_type": "execute_result"}], "source": ["## 
KEEPOUTPUT\n", "y_hatb = tf.nn.softmax(y_hat).numpy()\n", "y_hatb"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Check that each row sums to one"]}, {"cell_type": "code", "execution_count": 243, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([1., 1., 1., 1., 1.])"]}, "execution_count": 243, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "y_hatb.sum(axis=1)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["How would you now measure how close `y_hatb` is to the expected output in `y_ohe`?\n", "\n", "**cross entropy**: just take the probability assigned to the correct class (and pass it through a log function)\n", "\n", "$$\\text{loss}(\\bar{\\mathbf{y}}^{(i)}, \\hat{\\bar{\\mathbf{y}}}^{(i)}) = -\\sum_{k=0}^9 \\bar{y}^{(i)}_k\\log(\\hat{\\bar{y}}^{(i)}_k)$$\n", "\n", "where $\\bar{\\mathbf{y}}^{(i)}$ is the one-hot encoding of the expected class (label) for data point $i$.\n", "\n", "Observe that:\n", "\n", "- in the one-hot encoding $\\bar{\\mathbf{y}}^{(i)}$ only one of the elements is 1 and the rest are 0, so the summation above only takes the log of the probability assigned to the correct label.\n", "- the negative sign is there because logs of values < 1 are negative, and we will later want to **minimize** the loss\n", "\n", "This is implemented in **TensorFlow**"]}, {"cell_type": "code", "execution_count": 253, "metadata": {}, "outputs": [{"data": {"text/plain": [""]}, "execution_count": 253, "metadata": {}, "output_type": "execute_result"}], "source": ["## KEEPOUTPUT\n", "tf.keras.losses.categorical_crossentropy(y_ohe, y_hatb)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Observe that TensorFlow also implements the corresponding **sparse** convenience function, which works directly with our integer labels `y` (no one-hot encoding needed)"]}, {"cell_type": "code", "execution_count": 261, "metadata": {}, "outputs": [{"data": {"text/plain": [""]}, "execution_count": 261, "metadata": {}, "output_type": "execute_result"}], "source": ["## 
KEEPOUTPUT\n", "tf.keras.losses.sparse_categorical_crossentropy(y, y_hatb)"]}, {"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": []}], "metadata": {"kernelspec": {"display_name": "p38", "language": "python", "name": "p38"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3"}}, "nbformat": 4, "nbformat_minor": 4}